
Using Large Language Models to Assess Teachers' Pedagogical Content Knowledge

Yang, Yaxuan, Wang, Shiyu, Zhai, Xiaoming

arXiv.org Artificial Intelligence

Assessing teachers' pedagogical content knowledge (PCK) through performance-based tasks is both time- and effort-consuming. While large language models (LLMs) offer new opportunities for efficient automatic scoring, little is known about whether LLMs introduce construct-irrelevant variance (CIV) in ways similar to or different from traditional machine learning (ML) and human raters. This study examines three sources of CIV -- scenario variability, rater severity, and rater sensitivity to scenario -- in the context of video-based constructed-response tasks targeting two PCK sub-constructs: analyzing student thinking and evaluating teacher responsiveness. Using generalized linear mixed models (GLMMs), we compared variance components and rater-level scoring patterns across three scoring sources: human raters, supervised ML, and LLM. Results indicate that scenario-level variance was minimal across tasks, while rater-related factors contributed substantially to CIV, especially in the more interpretive Task II. The ML model was the most severe and least sensitive rater, whereas the LLM was the most lenient. These findings suggest that the LLM contributes to scoring efficiency while also introducing CIV as human raters do, yet with varying levels of contribution compared to supervised ML. Implications for rater training, automated scoring design, and future research on model interpretability are discussed.
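As a rough sketch of the variance-partitioning idea described above (not the authors' actual GLMM specification), a linear mixed model with a rater-level random intercept can separate rater severity from scenario effects and residual noise. The example below uses statsmodels' `MixedLM` on simulated scores; the rater labels, severity shifts, and effect sizes are all illustrative assumptions.

```python
# Illustrative sketch only: simulated data, not the study's pipeline.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Three scoring sources (human, supervised ML, LLM) each score 60 responses
# drawn from 3 scenarios; "severity" is an assumed rater-level intercept shift.
severity = {"human": 0.0, "ml": -0.5, "llm": 0.5}
rows = []
for rater, shift in severity.items():
    for scenario in ("A", "B", "C"):
        for _ in range(20):
            score = 3.0 + shift + 0.1 * "ABC".index(scenario) + rng.normal(0, 0.3)
            rows.append({"rater": rater, "scenario": scenario, "score": score})
df = pd.DataFrame(rows)

# A random intercept per rater captures severity; a fixed effect of scenario
# captures scenario-level variability.
result = smf.mixedlm("score ~ C(scenario)", df, groups="rater").fit()
rater_var = float(result.cov_re.iloc[0, 0])  # between-rater variance
resid_var = float(result.scale)              # residual variance
print(f"rater variance: {rater_var:.3f}, residual variance: {resid_var:.3f}")
```

Comparing the between-rater variance to the residual variance is one simple way to quantify how much of the score variation is rater-driven CIV rather than response-driven signal.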


Multilingual Performance of a Multimodal Artificial Intelligence System on Multisubject Physics Concept Inventories

Kortemeyer, Gerd, Babayeva, Marina, Polverini, Giulia, Gregorcic, Bor, Widenhorn, Ralf

arXiv.org Artificial Intelligence

We investigate the multilingual and multimodal performance of a large language model-based artificial intelligence (AI) system, GPT-4o, on a diverse set of physics concept inventories spanning multiple languages and subject areas. The inventories taken from the PhysPort website cover the classical physics topics of mechanics, electromagnetism, optics, and thermodynamics as well as relativity, quantum mechanics, astronomy, mathematics, and laboratory skills. Unlike previous text-only studies, we uploaded the inventories as images mirroring what a student would see on paper, assessing the system's multimodal functionality. The AI is prompted in English and autonomously chooses the language of its response - either remaining in the nominal language of the test, switching entirely to English, or mixing languages - revealing adaptive behavior dependent on linguistic complexity and data availability. Our results indicate some variation in performance across subject areas, with laboratory skills standing out as the area of poorest performance. Furthermore, the AI's performance on questions that require visual interpretation of images is worse than on purely text-based questions. Questions that are difficult for the AI tend to be difficult regardless of the inventory language. We also find large variations in performance across languages, with some appearing to benefit substantially from language switching, a phenomenon similar to code-switching of human speakers. Overall, comparing the obtained AI results to the existing literature, we find that the AI system outperforms average undergraduate students post-instruction in all subject areas but laboratory skills.


Representing Pedagogic Content Knowledge Through Rough Sets

Mani, A

arXiv.org Artificial Intelligence

A teacher's knowledge base consists of knowledge of mathematics content, knowledge of student epistemology, and pedagogical knowledge. This base strongly shapes the teacher's understanding of students' knowledge of content, and of the learning context in general. The necessity of formalizing the different kinds of content knowledge in approximate senses is recognized in the education research literature. A related problem is that of coherent formalizability. Existing responsive or smart AI-based software systems do not concern themselves with meaning, and trained ones are replete with their own issues. In the present research, many issues in modeling teachers' understanding of content are identified, and a two-tier rough set-based model is proposed by the present author for the purpose of developing software that can aid the varied tasks of a teacher. The main advantage of the proposed approach is its ability to coherently handle vagueness, granularity, and multi-modality. An extended example involving equational reasoning is used to demonstrate these. The paper is meant for rough set researchers intending to build logical models or develop meaning-aware AI software to aid teachers, and for education research experts.


Why we should train workers like we train machine learning algorithms

#artificialintelligence

The evolution of workforce opportunity in the United States depends on the future of education and our commitment to far-reaching, equitable federal reform. Unfortunately, policy conversations at the federal and state levels about transforming education systems to meet future workforce demands have focused disproportionately on a skills agenda, largely ignoring behavioral competencies that often complement and enhance the value of technical skills. This misguided approach equates 21st-century workforce development with skills acquisition, which only serves to reinforce a two-tiered workforce: those who are best positioned to acquire and monetize their skills will be granted mobility and long-term security while all others continue to be stranded on the bottom rung of the socioeconomic ladder. Developing intelligent policies to combat workforce inequality requires acknowledging that employer demand for "skills" actually refers to a constellation of content knowledge, technical abilities, and applied intelligence. Per the National Association of Colleges and Employers' 2018 Job Outlook survey, eight out of 10 employers reported that applicants' problem-solving and teamwork abilities influenced hiring decisions; only six out of 10 employers reported the same for technical skills.


#TimTalks - How AI Will Impact Your Organisation with regards to Content/Knowledge

#artificialintelligence

In this Tim Talks episode, I discuss with Neill Horie (@NeillHorie) and DLA's AI expert Katie King (@katieeking) how AI will impact an organisation: the challenges as well as the opportunities. For example, AI can be a unifying technology for the silos of marketing, SEO, social, customer support, etc.


Personalized Guided Tour by Multiple Robots through Semantic Profile Definition and Dynamic Redistribution of Participants

Hristoskova, Anna (Ghent University) | Aguero, Carlos (Universidad Rey Juan Carlos) | Veloso, Manuela (Carnegie Mellon University) | Turck, Filip De (Ghent University)

AAAI Conferences

Existing robot guides are able to offer a tour of a building, such as a museum, bank, or science center, to a single person or to a group of participants. Usually the tours are predefined and there is no support for dynamic interactions between multiple robots. This paper focuses on distributed collaboration between several robot guides providing a building tour to groups of participants. Semantic techniques are adopted in order to formally define the tour topics, the available content on a specific topic, and the robot and human profiles, including their interests and content knowledge. The robot guides select different topics depending on their participants' interests and prior knowledge. Optimization of the topics of interest is achieved through exchange of participants between the robot guides whenever they are in each other's neighborhood. Evaluation of the implemented algorithms shows 90% content coverage of relevant topics for the individual participants.